ABSTRACT

This paper describes the Seventh Data Release of the Sloan Digital Sky Survey (SDSS), marking
the completion of the original goals of the SDSS and the end of the phase known as SDSS-II. It
includes 11663 deg^2 of imaging data, with most of the ~2000 deg^2 increment over the previous data
release lying in regions of low Galactic latitude. The catalog contains five-band photometry for 357
million distinct objects. The survey also includes repeat photometry on a 120° long, 2.5° wide stripe
along the Celestial Equator in the Southern Galactic Cap, with some regions covered by as many as
90 individual imaging runs. We include a coaddition of the best of these data, going roughly two
magnitudes fainter than the main survey over 250 deg^2. The survey has completed spectroscopy over
9380 deg^2; the spectroscopy is now complete over a large contiguous area of the Northern Galactic
Cap, closing the gap that was present in previous data releases. There are over 1.6 million spectra in
total, including 930,000 galaxies, 120,000 quasars, and 460,000 stars.

The data release includes improved stellar photometry at low Galactic latitude. The astrometry
has all been recalibrated with the second version of the USNO CCD Astrograph Catalog (UCAC-2),
reducing the rms statistical errors at the bright end to 45 milli-arcseconds per coordinate. We further
quantify a systematic error in bright galaxy photometry due to poor sky determination; this problem
is less severe than previously reported for the majority of galaxies. Finally, we describe a series of
improvements to the spectroscopic reductions, including better flat-fielding and improved wavelength
calibration at the blue end, better processing of objects with extremely strong narrow emission lines,
and an improved determination of stellar metallicities.
Subject headings: Atlases — Catalogs — Surveys 
1. OVERVIEW OF THE SLOAN DIGITAL SKY SURVEY

The Sloan Digital Sky Survey (SDSS; York et al. 2000) saw first light a decade ago, with the goals of obtaining
CCD imaging in five broad bands over 10,000 deg^2 of high-latitude sky, and spectroscopy of a million galaxies and
one hundred thousand quasars over this same region. With this, its seventh public data release, these goals have
been realized. The survey facilities have also been used to carry out a comprehensive imaging and spectroscopic
survey to explore the structure, composition, and kinematics of the Milky Way Galaxy (Sloan Extension for Galactic
Understanding and Exploration; SEGUE; Yanny et al. 2009), and a repeat imaging survey that has discovered almost
500 spectroscopically confirmed Type Ia supernovae with superb light curves (Frieman et al. 2008; Holtzman et al.
2008).

The SDSS uses a dedicated wide-field 2.5m telescope (Gunn et al. 2006) located at Apache Point Observatory (APO)
near Sacramento Peak in Southern New Mexico. The telescope uses two instruments. The first is a wide-field imager 
(Gunn et al. 1998) with 24 2048 x 2048 CCDs on the focal plane with 0.396" pixels, that covers the sky in drift scan 
mode in five filters in the order riuzg (Fukugita et al. 1996). The imaging is done with the telescope tracking great 
circles at the sidereal rate; the effective exposure time per filter is 54.1 seconds, and 18.75 deg^2 are imaged per hour
in each of the five filters. The images are mostly taken under good seeing conditions (the median is about 1.4" in r) 
on moonless photometric nights (Hogg et al. 2001); the exceptions are a series of repeat scans of the Celestial Equator 
in the Fall for a supernova search (Frieman et al. 2008), as is described in more detail in § 3.2. The 95% completeness 
limits of the images are u, g, r, i, z = 22.0, 22.2, 22.2, 21.3, 20.5, respectively (Abazajian et al. 2004), although these
values depend as expected on seeing and sky brightness. The images are processed through a series of pipelines that 
determine an astrometric calibration (Pier et al. 2003) and detect and measure the brightnesses, positions and shapes 
of objects (Lupton et al. 2001; Stoughton et al. 2002). The astrometry is good to 45 milli-arcseconds (mas) rms per 
coordinate at the bright end, as described in more detail in § 4.4. The photometry is calibrated to an AB system 
(Oke & Gunn 1983), and the zeropoints of the system are known to 1-2% (Abazajian et al. 2004). The photometric 
calibration is done in two ways, by tying to photometric standard stars (Smith et al. 2002) measured by a separate 
0.5-m telescope on the site (Tucker et al. 2006; Ivezic et al. 2004), and by using the overlap between adjacent imaging 
runs to tie the photometry of all the imaging observations together, in a process called ubercalibration (Padmanabhan
et al. 2008). Results of both processes are made available; with this data release, the ubercalibration results, which are
uncertain at the ~1% level in griz and 2% in u, are now the default photometry.

The photometric catalogs of detected objects are used to identify objects for spectroscopy with the second of the 
instruments on the telescope: a 640-fiber-fed pair of multi-object double spectrographs, giving coverage from 3800 Å to
9200 Å at a resolution of λ/Δλ ~ 2000. The objects chosen for spectroscopic follow-up are selected based on photometry
corrected for Galactic extinction following Schlegel, Finkbeiner, & Davis (1998; hereafter SFD), and include: 

• A sample of galaxies complete to a Petrosian (1976) magnitude limit of r = 17.77 (Strauss et al. 2002); 

• Two deeper samples of luminous red ellipticals selected in color-magnitude space to r = 19.2 and r = 19.5,
which produce an approximately volume-limited sample to z = 0.38, and a flux-limited sample
extending to z = 0.55, respectively (Eisenstein et al. 2001);

• Flux-limited samples of quasar candidates, selected by their non-stellar colors or FIRST (Becker et al. 1995)
radio emission to i = 19.1 in regions of color space characteristic of z < 3 quasars, and to i = 20.2 for quasars
with 3 < z < 5.5 (Richards et al. 2002);

• A variety of ancillary samples, including optical counterparts to ROSAT-detected X-ray sources (Anderson et
al. 2007);

• Stars for spectrophotometric calibration and telluric absorption correction, as well as regions of blank sky for 
accurate sky subtraction; 

• A variety of categories of stellar targets with a series of color and magnitude cuts for measurements of radial 
velocity, metallicity, surface temperature, and Galactic structure as part of SEGUE (Yanny et al. 2009).

These targets are arranged on tiles of radius 1.49°, with centers chosen to maximize the number of targeted objects
(Blanton et al. 2003). Each tile contains 640 objects, and forms the template for an aluminum spectroscopic plate,
in which holes are drilled to hold optical fibers that feed the spectrographs. Spectroscopic exposures are 15 minutes
long, and three or more are taken for a given plate to reach pre-defined requirements of signal-to-noise ratio (S/N),
namely (S/N)^2 > 15 per 1.5 Å pixel for stellar objects of fiber magnitude g = 20.2, r = 20.25 and i = 19.9. For the
SEGUE faint plates, the exposures are considerably deeper, and typically consist of eight 15-minute exposures, giving
(S/N)^2 ~ 100 at the same depth (Yanny et al. 2009).

Spectra are extracted and calibrated in wavelength and flux. The typical S/N of a galaxy near the main sample
flux limit is 10 per pixel. The broad-band spectrophotometric calibration is accurate to 4% rms for point sources
(Adelman-McCarthy et al. 2008), and the wavelength calibration is good to 2 km s^-1. The spectra are classified and
redshifts determined using a pair of pipelines (Stoughton et al. 2002; Subbarao et al. 2002), which give consistent
results 98% of the time; the discrepant objects tend to be of very low S/N, or very unusual objects, such as extreme
BALs, superposed sources, and so on. The vast majority of the spectra of galaxies and quasars yield reliable redshifts;
the failure rate is of order 1% for galaxies, and slightly larger for quasars. The stellar targets are further processed by a
separate pipeline (Lee et al. 2008a, b; Allende Prieto et al. 2008a) which determines surface temperatures, metallicities,
and gravities.

The resulting catalogs are stored and distributed via a database accessible on the web (the Catalog Archive Server,
CAS; Thakar et al. 2008), and the images and flat files are available in bulk through the Data Archive Server
(DAS).

The SDSS saw first light in May 1998, and started routine operations in April 2000. It was originally funded for five
years of operations, but had not completed its core goals of imaging and spectroscopy of a large contiguous area of
the Northern Galactic Cap by 2005. The survey was extended for an additional three years, with the additional goals
of the SEGUE and the supernova surveys mentioned above. The extended program is known as SDSS-II, and the
component of SDSS-II that represents the completion of SDSS-I is known as the Legacy Survey. SDSS-II observations
were completed in July 2008.

The SDSS data have been made public in a series of yearly data releases (Stoughton et al. 2002; Abazajian et al. 
2003, 2004, 2005, 2006; Adelman-McCarthy et al. 2007, 2008; hereafter the EDR, DR1, DR2, DR3, DR4, DR5, and
DR6 papers, respectively). The most recent of these papers described the Sixth Data Release (DR6), which included 
data taken through July 2006. The present paper describes the Seventh Data Release (DR7), including data taken 
through the end of SDSS-II in 2008 July, and thus represents two additional years of data. The data releases are 
cumulative; DR7 includes all data included in the previous releases as well. In § 2, we describe the footprint of this 
survey; most importantly, we have completed our goals of: 

• contiguous imaging and spectroscopy over 7500 deg^2 of the Northern Galactic Cap (the Legacy survey);

• imaging and spectroscopy of stellar sources over an additional 3500 deg^2 at lower Galactic latitudes to study the
structure of the Milky Way; and

• repeat imaging of > 250 deg^2 on the Celestial Equator in the Fall months to discover Type Ia supernovae with
0.1 < z < 0.4.

In § 3, we describe the repeat scans on the Celestial Equator, including a coaddition of the images to reach about
two magnitudes deeper than the main survey. In § 4, we present improvements in the processing of the imaging data,
including improved stellar photometry at low Galactic latitudes, an astrometric recalibration, and improvements in 
our photometric redshift algorithms for galaxies. The DR6 paper described a problem with the photometry of bright 
galaxies; we explore this further in § 5. In § 6, we discuss improvements in the spectroscopic processing of the data. 
The DR6 paper described improvements in the wavelength and spectrophotometric calibration; we have implemented 
further refinements which are important in the determination of accurate stellar parameters from the spectra. 
We conclude in § 7 with a discussion of the future of the SDSS project. 
TABLE 1
Coverage and Contents of DR7

Imaging

  Imaging area in CAS                                     11663 deg^2
  Imaging catalog in CAS                                  357 million unique objects
  Legacy footprint area                                   8423 deg^2
                                                          (7646 deg^2 in N. Galactic Cap)
  Legacy imaging catalog                                  230 million unique objects
                                                          585 million entries (including duplicates)
  SEGUE footprint area, available in DAS (a)              3500 deg^2 (more than double DR6)
  SEGUE footprint area, available in CAS                  3240 deg^2
  SEGUE imaging catalog                                   127 million unique objects
  M31, Perseus, Sagittarius scan area                     ~ 46 deg^2
  Southern Equatorial Stripe with > 70 repeat scans       ~ 250 deg^2
  Commissioning ("Orion") data                            832 deg^2

Spectroscopy

  Spectroscopic footprint area                            9380 deg^2
    Legacy                                                8032 deg^2
    SEGUE                                                 1348 deg^2
  Total number of plate observations (640 fibers each)    2564
    Legacy survey plates                                  1802
    SEGUE and special plates                              676
    Repeat observations of plates                         86
  Total number of spectra (b)                             1,630,960
    Galaxies                                              929,555
    Quasars                                               121,363
    Stars                                                 464,261
    Sky                                                   97,398
    Unclassifiable                                        28,383
  Spectra after removing skies and duplicates             1,440,961

(a) Includes regions of high stellar density, where the photometry is likely to be poor. See text
for details. This area also includes some regions of overlap.
(b) Spectral classifications from the spectro1d code; numbers include duplicates.

2. SURVEY FOOTPRINT

Table 1 summarizes the contents of DR7, giving the imaging and spectroscopic sky coverage and number of objects. 
The imaging footprint has increased by roughly 22% since DR6 (most of it outside the contiguous area of the North 
Galactic Cap), and the number of spectra has increased by 29%. 

The imaging for the Legacy survey was substantially complete with DR6. In DR7, we include imaging of a few small
gaps that were missed in the contiguous region of the North Galactic Cap, and repeat observations of a few regions of
the sky which had particularly poor seeing in previous data releases. The total footprint has increased by less than 10
deg^2 in total. The Legacy imaging footprint is visible as the large contiguous gray area on the left side of the upper
panel of Figure 1, together with the three gray stripes visible on the right side. The principal augmentation of the
imaging data in DR7 is the set of stripes which are part of the SEGUE survey. They are indicated in red in the figure,
and increase the SDSS imaging footprint by roughly 2000 deg^2 over DR6. Note that many of these cross the Galactic
plane (indicated by the sinuous line crossing the figure). Unlike DR6, the union of the Legacy and SEGUE data is
now available in a single database in CAS in DR7.

These data have been recalibrated with ubercalibration (Padmanabhan et al. 2008), which uses the overlap between
adjacent scans; the resulting photometry is now the default photometry found in the CAS. We also make available
the original photometry calibrated by the auxiliary Photometric Telescope (Tucker et al. 2006). The ubercalibration
solution was regenerated using all the imaging data, but the changes are tiny from the ubercalibration results published
in DR6: 0.001 mag rms in griz and 0.003 mag in u. The ubercalibrated photometry zeropoints are defined to be the
same as those measured from the Photometric Telescope.
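
To illustrate how overlaps between scans can tie zeropoints together, the following is a minimal Python sketch. It is a toy model, not the ubercalibration algorithm of Padmanabhan et al. (2008): it assumes each run has a single unknown additive zeropoint, that the mean magnitude difference of stars common to each pair of overlapping runs has already been measured, and that one run is held fixed as the reference. The function name `solve_zeropoints`, its arguments, and the example numbers are hypothetical.

```python
from collections import defaultdict


def solve_zeropoints(overlaps, n_runs, anchor=0, n_iter=200):
    """Toy relative calibration from pairwise overlaps (illustrative only).

    overlaps: list of (i, j, d), where d is the mean of (m_i - m_j) for stars
              seen in both run i and run j, using uncalibrated magnitudes.
    Calibrated magnitudes are m_k + z_k, so agreement on common stars
    requires z_j - z_i = d. Run `anchor` is held at z = 0.
    """
    neighbours = defaultdict(list)
    for i, j, d in overlaps:
        neighbours[j].append((i, d))    # z_j = z_i + d
        neighbours[i].append((j, -d))   # z_i = z_j - d

    z = [0.0] * n_runs
    # Gauss-Seidel relaxation: set each zeropoint to the average value
    # implied by its overlapping neighbours, and repeat until it settles.
    for _ in range(n_iter):
        for k in range(n_runs):
            if k == anchor or not neighbours[k]:
                continue
            z[k] = sum(z[n] + d for n, d in neighbours[k]) / len(neighbours[k])
    return z


# Hypothetical example: three mutually overlapping runs.
example_overlaps = [(0, 1, 0.03), (1, 2, -0.05), (0, 2, -0.02)]
zeropoints = solve_zeropoints(example_overlaps, n_runs=3)
print([round(value, 4) for value in zeropoints])
```

The relaxation scheme is chosen only to keep the sketch free of external dependencies; a linear least-squares solver would give the same answer for this model. The flat-field terms mentioned in the text are omitted.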

The green and blue patches indicate supplementary imaging stripes, which contain scans over M31 or in its halo, 
through the center of the Perseus cluster of galaxies, over the low-latitude globular cluster M71, near the South Galactic 
Pole, along the orbit of the Sagittarius Tidal Stream, and through the star-forming regions of Orion (Finkbeiner et al. 
2004). In addition, there are a number of scans at angles perpendicular, or at an oblique angle, to the regular Legacy 
or SEGUE imaging stripes. These scans are used in the ubercalibration procedure to tie the zeropoints of the stripes 
together and to determine the flat-fields. 

The lower panel in Figure 1 shows the coverage of spectroscopy in DR7; the light gray area shows the increment 
in the Legacy survey over DR6. Most importantly, the gap cutting the North Galactic Cap in two pieces in previous 
data releases has been closed; we now have complete spectroscopy of our principal galaxy and quasar targets over a 
contiguous area of roughly 7500 deg^2. An additional dozen plates were observed to fill holes in the nominally contiguous
regions in DR6. Adding in the three stripes in the Southern Galactic Cap, the Legacy spectroscopy footprint is 8032
deg^2, a 26% increment over DR6.

Fig. 1. The distribution on the sky of the data included in DR7 (upper panel: imaging; lower panel: spectra), shown in an Aitoff
equal-area projection in J2000 Equatorial Coordinates. The Galactic Plane is the sinuous line that goes through each panel. The center
of each panel is at α = 120° = 8h, and the plots cut off at δ = −25°, below which the SDSS did not extend. The Legacy imaging
survey covers the contiguous area of the Northern Galactic Cap (centered roughly at α = 200°, δ = 30°), as well as three stripes (each of
width 2.5°) in the Southern Galactic Cap. In addition, several stripes (indicated in blue in the imaging data) are auxiliary imaging data,
while the SEGUE imaging scans are indicated in red. The green scans are additional runs as described in Finkbeiner et al. (2004). In
the spectroscopy panel, the lighter regions indicate that area in the Northern Galactic Cap which is new to DR7; note that the Northern
Galactic Cap is now contiguous. Red points indicate SEGUE plates, and blue points indicate other non-Legacy plates (mostly as described
in the DR4 paper).

In addition, spectroscopy was carried out using a series of target selection algorithms designed to find stars of a wide 
variety of types as part of the SEGUE project (DR6 paper; Yanny et al. 2009). These targets were drawn from both 
the SEGUE and Legacy imaging, and are shown in red in the lower panel of Figure 1. As some of these are lost in the 
density of Legacy spectra, we show the distribution of SEGUE and other non-Legacy spectra in Galactic coordinates 
in Figure 2. 

Finally, as described in Yanny et al. (2009), we carried out spectroscopy of stars in 12 open and globular clusters 
to calibrate the measurements of stellar parameters in SEGUE (Lee et al. 2008a, b). Many of these clusters are 
sufficiently close that the giant branches are brighter than the photometric saturation limit of SDSS, so the targets for 
these plates were selected from the literature. Indeed, the spectrographs would saturate as well with our standard 15- 
minute exposures, so these observations had individual exposure times as short as one or two minutes. Without proper 
flux calibrators or exposure of bright sky lines to set the zeropoint of the wavelength scale, the spectrophotometry 
and wavelength calibration of the spectra on these plates are often quite inferior to that of the main survey, and these 
plates are available only in the DAS, not the CAS. 

As described in more detail below, the 2.5° stripe centered on the Celestial Equator was imaged multiple times
throughout SDSS and SDSS-II. Each 2.5° wide stripe is observed by a pair of offset strips to cover the full width (York
et al. 2000); the coverage of the two strips of Stripe 82 is shown in Figure 3. The data are shown both for the subset
of data included in a deep coaddition (lower set of curves; § 3.3), and all scans, including those taken under non-ideal
conditions for the supernova survey (§ 3.2; Frieman et al. 2008).

Fig. 2. The distribution on the sky of SEGUE (red) and other non-Legacy (blue) spectroscopic observations, here plotted in Galactic
coordinates. The contiguous blue stripe across the bottom is Stripe 82, along the Celestial Equator. As described in the DR4 paper, Stripe
82 includes extensive spectroscopy of a number of different types of targets outside the Legacy survey.

3. ADDITIONAL IMAGING PRODUCTS AND DATABASES 
3.1. The Runs Database 

The SDSS imaging survey was primarily designed to give a single pass across the sky; thus, in the CAS, each
photometric measurement is flagged either Primary or Secondary. Primary objects designate a unique set of detections
(i.e., without duplicates) using the geometric boundaries of survey stripes. The set of Secondary objects includes
repeat observations of the same object in overlapping strips and stripes. Primary objects are associated with a run
and field which is the primary source of imaging data at that position. In DR7, the union of the Legacy and SEGUE
footprints serves as the Primary footprint; a quantity inLegacy in the fieldQA table in CAS indicates those objects
which lie within the original Legacy Northern Galactic Cap Survey ellipse, as defined in York et al. (2000). Legacy
imaging can also be distinguished by the stripe number for each run; stripes 9-44, 76, 82 and 86 are in the Legacy
survey, all others are SEGUE stripes or other miscellaneous pieces of sky (Figure 1).

While resolving the sky into a seamless Primary region of unique detections of objects is ideal for many science 
queries, it is sometimes convenient to query data by run without regard to the way the survey resolves overlaps and 
imposes the boundaries of the edge of the survey. These boundaries are restricted to matched pairs of North and South 
strips in the main DR7 CAS. Therefore in many runs, several fields at the beginning or end which do not have a match 
in the corresponding other strip are not included in the main CAS. Thus we have now made available a separate runs 
database within the CAS, which includes all fields in all runs, and which allows one to query objects by which run 
they are imaged in. 

The runs database contains 530 complete runs from SDSS-I and SDSS-II, where Primary is set strictly based on
geometric limits within each scan, regardless of overlapping runs or stripes. The runs database also contains several
scans outside the regular DR7 Legacy or SEGUE footprints. For example, Stripe 205, which follows the Sagittarius
Stream, is covered by runs 4334, 4516, 6751 and 6794, and is in three pieces: the first running from (α, δ) = (240°, −15°)
to (200°, +10°), the second centered at (135°, +35°), and the last (overlapping several other runs) which ends at
(45°, +10°).

Footnote: see http://www.sdss.org/dr7/algorithms/resolve.html for a detailed explanation of how Primary and Secondary
detections are resolved.

Fig. 3. Stripe 82, the Equatorial stripe in the South Galactic Cap, has been imaged multiple times. The lower pair of curves show
the number of scans covering a given right ascension in the North and South strip that are included in the coaddition (mostly data taken
through 2005). In addition, Stripe 82 has been covered many more times as part of a comprehensive survey for 0.05 < z < 0.35 supernovae,
although often in conditions of poor seeing, bright moon, and/or clouds; the total numbers of scans at each right ascension in the North and
South strip are indicated in the upper pair of curves. All these data have been flux-calibrated, as discussed in the text, and are available
(together with the coadd itself) in the Stripe82 database.

3.2. The Stripe 82 Database 

The SDSS stripe along the Celestial Equator in the Southern Galactic Cap ("Stripe 82") was imaged multiple times in 
the Fall months. This was first carried out to allow the data to be stacked to reach fainter magnitudes, and through Fall 
2004, these data were taken only under optimal seeing, sky brightness, and photometric conditions (i.e., the conditions 
required for imaging in the Legacy Survey; York et al. 2000). There were 84 such runs made public in previous data 
releases. In Fall 2005, 2006, and 2007, 219 additional imaging runs were taken on Stripe 82 as part of the SDSS 
supernova survey (Frieman et al. 2008), often under less optimal conditions: poor seeing, bright moonlight, and/or 
non-photometric conditions. These data have been photometrically calibrated following the prescription of Bramich
et al. (2008), whereby the photometry of bright stars is tied to that of photometric data on a field-by-field basis (see
Ivezic et al. 2007 for a similar approach). Bramich et al. solved for photometric offsets both parallel and perpendicular
to the scan direction in data from a given CCD; we found that the term perpendicular to the scan direction added 
little, and we did not include it here. As Bramich et al. (2008) show, the resulting photometric calibration is good to 
0.02 mag at the bright end in up to 1 mag of atmospheric extinction. Of course, under non-optimal conditions, these 
data will not necessarily reach as deep as normal survey images. 
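
The idea of tying each field to reference photometry through its bright stars can be sketched in Python as follows. This is only a schematic illustration under simple assumptions, not the procedure of Bramich et al. (2008): it uses a single additive offset per field, estimated as the median difference between reference and observed magnitudes of matched stars. The function names and the bright-star limit of 18 mag are hypothetical choices made for the example.

```python
from statistics import median


def field_offset(ref_mags, obs_mags, bright_limit=18.0):
    """Median (reference - observed) magnitude of the bright stars in one field.

    ref_mags and obs_mags are matched lists: element k of each refers to the
    same star. bright_limit is an assumed cut, not a value from the paper.
    """
    diffs = [ref - obs for ref, obs in zip(ref_mags, obs_mags) if ref < bright_limit]
    if not diffs:
        raise ValueError("no bright stars available to calibrate this field")
    return median(diffs)


def calibrate_field(obs_mags, offset):
    """Apply the additive offset to every observed magnitude in the field."""
    return [mag + offset for mag in obs_mags]


# Hypothetical field observed through about 0.4 mag of cloud extinction.
reference = [15.0, 16.2, 17.5, 20.0]
observed = [15.4, 16.6, 17.9, 20.9]
offset = field_offset(reference, observed)
calibrated = calibrate_field(observed, offset)
```

The median is used here so that a few variable or mismatched stars do not bias the offset; the faint star in the example is excluded from the fit but still receives the correction.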

SDSS judges photometricity of a given night by monitoring fluctuations in the night sky measured by a wide area
infrared camera (the "cloud camera") sensitive at 10 μm, where clouds are emissive (Hogg et al. 2001). If the sky
fluctuations are small and constant, then the night is photometric. Clouds cause the fluctuations to increase. Plots
of cloud cover and seeing for most nights on which Stripe 82 was observed are available as part of the DR7 web
documentation listing all Stripe 82 scans. In addition, for those runs which the cloud camera indicated as
non-photometric, we examined the fluctuations in the zeropoint for each CCD in the camera as a function of time using
the photometric calibration procedure of Bramich et al. (2008). These zeropoint values are available in the CAS;
rms variations of more than 0.1 mag are an indication of considerable variable cloud cover, and a value of more than
1 magnitude suggests that the approximate calibration procedure of Bramich et al. (2008) breaks down, and the
resulting photometry should be regarded with caution. All 303 runs covering Stripe 82 are made available as part of
the Stripe82 database, which is structured like the runs database.
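
As a rough illustration of how these thresholds might be applied to the zeropoint values, consider the Python sketch below. It rests on an interpretation that the text does not spell out: the 0.1 mag threshold is applied to the rms scatter of the zeropoint offsets within a run, and the 1 mag threshold to the largest absolute offset from the clear-sky zeropoint. The function name, the input format, and the returned labels are all hypothetical.

```python
from statistics import pstdev


def assess_run(zeropoint_offsets, rms_limit=0.1, offset_limit=1.0):
    """Classify a run from its per-field zeropoint offsets (in magnitudes).

    zeropoint_offsets: offsets of the fitted zeropoint from the clear-sky
    value, one per field (assumed input format).
    Returns one of "stable", "variable cloud", or "use with caution".
    """
    largest = max(abs(value) for value in zeropoint_offsets)
    scatter = pstdev(zeropoint_offsets)  # rms about the mean offset
    if largest > offset_limit:
        return "use with caution"
    if scatter > rms_limit:
        return "variable cloud"
    return "stable"


# Hypothetical runs.
clear_run = assess_run([0.01, 0.02, 0.015])
cloudy_run = assess_run([0.0, 0.3, 0.6, 0.1])
thick_cloud_run = assess_run([0.2, 1.4, 0.9])
```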



3.3. Going Deep on Stripe 82

We have carried out a coaddition of the repeat imaging scans on Stripe 82 taken through Fall 2005 under the best
conditions (see below). The coaddition includes a total of 122 runs, covering any given piece of the > 250 deg^2
area between 20 and 40 times (Figure 3), and the results are made available in the Stripe82 database as well. The
coaddition runs are designated 100006 (South strip) and 200006 (North strip), respectively, in the DAS, and 106 and
206 in the CAS.

The coaddition is described in detail in Annis et al. (2009); see also Jiang et al. (2008). From the list of runs on
Stripe 82 taken through the Fall 2005 season, all fields with seeing in the r band worse than 2" FWHM, r-band sky
brightness brighter than 19.5 magnitudes in one square arcsecond, or whose photometric correction à la Bramich et
al. (2008; see above) was greater than 0.2 mag were excised; this rejected 32% of the available data. The individual
runs were remapped onto a uniform astrometric coordinate system. Interpolated pixels in each individual run (e.g., for
bad columns, bleed trails, and cosmic rays) were masked in the coaddition process. The sky was subtracted from each
frame, and the images coadded with weights for each frame proportional to the transparency and inversely proportional
to the square of sky noise and seeing on each frame. Strongly discrepant pixels were clipped in the coaddition. The
effective seeing FWHM is ~ 1.2" (for the southern strip of the stripe) and ~ 1.3" (for the northern strip).

The resulting coadded images were run through the SDSS photometric pipeline, yielding the catalogue made available
in the Stripe82 database. Rather than deriving the Point Spread Function (PSF) from scratch, we synthesized the
PSF at each point in the sky by taking the suitably weighted sum of the PSFs output by the SDSS photometric
pipeline from each of the individual runs.

Color-color diagrams of stars and counts of stars and galaxies as a function of magnitude demonstrate that the
photometry reaches roughly two magnitudes fainter than single SDSS scans, similar to what is expected given the
number of runs in the coadd. We have found that star-galaxy separation is improved over that in the single scans, in
that the cut can be made closer to the stellar locus. In the main survey, objects with m_PSF − m_model > 0.145 are
flagged as galaxies in a given band. However, the stellar peak in the PSF − model magnitude difference distribution
in the coadd is much narrower, allowing objects with m_PSF − m_model > 0.03 in r to be flagged as galaxies.

The coaddition does not properly propagate information on saturated pixels in individual runs, and therefore the
photometry of objects brighter than roughly r = 15.5 is suspect. Unfortunately, there is no processing flag that one
can use to identify such data: we recommend a simple magnitude cut.

The SDSS photometry is quoted in terms of asinh magnitudes, as described by Lupton, Gunn, & Szalay (1999),
whereby the logarithmic magnitude scale transitions to a linear scale in flux density f at low signal-to-noise ratio:

$$ m = -\frac{2.5}{\ln 10} \left[ \mathrm{asinh}\left( \frac{f/f_0}{2b} \right) + \ln b \right]. $$

The magnitude at which this transition occurs is set by the quantity b, which is roughly the fractional noise in the sky
in a PSF aperture in 1" seeing (EDR paper). Here f_0 = 3631 Jy, the zeropoint of the AB flux scale. The quantity b
for the coaddition is given in Table 2, along with the asinh magnitude associated with a zero flux object. Compare
with the equivalent numbers for the main survey, given in Table 21 of the EDR paper. Table 2 also lists the flux
corresponding to 10 f_0 b, above which the asinh magnitude and the traditional logarithmic magnitude differ by less
than 1% in flux.

As with the main survey, it is important to use the various processing flags output by the photometric pipeline (e.g.,
as recommended by Richards et al. 2002) to reject spurious objects, and to select objects with reliable photometry.
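
The asinh magnitude scale of Lupton, Gunn, & Szalay (1999) used for the SDSS photometry can be sketched numerically as follows. The Python functions below implement the standard asinh definition alongside the traditional logarithmic magnitude for comparison. The softening parameter b = 1.2e-10 used in the example is an assumed illustrative value, not one taken from Table 2, and the function names are hypothetical.

```python
import math

F0_JY = 3631.0  # zeropoint of the AB flux scale, in Jy


def asinh_mag(flux_jy, b):
    """Asinh magnitude for flux density flux_jy (Jy) and softening parameter b."""
    x = flux_jy / F0_JY
    return -(2.5 / math.log(10.0)) * (math.asinh(x / (2.0 * b)) + math.log(b))


def log_mag(flux_jy):
    """Traditional logarithmic AB magnitude; defined only for positive flux."""
    return -2.5 * math.log10(flux_jy / F0_JY)


b_example = 1.2e-10  # assumed illustrative softening parameter

# A zero-flux object has a finite asinh magnitude, equal to -2.5 log10(b).
zero_flux_mag = asinh_mag(0.0, b_example)

# At f = 10 f0 b the two scales differ by about 0.01 mag.
transition_flux = 10.0 * F0_JY * b_example
difference = asinh_mag(transition_flux, b_example) - log_mag(transition_flux)

# Negative measured fluxes, which arise from noise, still give a finite value.
negative_flux_mag = asinh_mag(-1.0e-7, b_example)
```

Unlike the logarithmic scale, the asinh form stays well defined at zero and negative flux, which is why it is convenient for faint detections near the sky noise.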

4. IMPROVEMENTS IN PROCESSING OF IMAGING DATA
4.1. New Reductions of SEGUE Imaging Data and Crowded Fields 

As was noted in the DR6 paper, the SDSS imaging pipeline (photo) was designed to analyze data at high Galactic 
latitudes, and is not optimized to handle very crowded fields. The Legacy survey is restricted to high latitudes, and 
photo performs adequately throughout the Legacy footprint. However, at lower latitudes, when the density of stars 
brighter than r = 21 grows above 5000 deg^-2, the pipeline is known to fail, as it is unable to find sufficiently isolated
stars to measure an accurate PSF, and the deblender does poorly with overly crowded images. Many of the SEGUE 
scans probe these low latitudes (Figure 1), and we therefore adapted an alternative stellar photometry code called 
PSPhot developed by the Pan-STARRS team (Kaiser et al. 2002; Magnier 2006) to be used for these runs. In brief, 
we first run this code, and then run photo using the list of objects detected by PSPhot as input to help photo's object 
finder in crowded regions. This approach thus provides two sets of photometry at low latitudes. 

Like, e.g., DAOPHOT (Stetson 1987), PSPhot begins with the assumption that every object is unresolved, and
therefore does a better job than photo in crowded stellar regions. It uses an analytical model based on Gaussians 
to describe the basic PSF shape, with parameters which may vary across the field of the image to follow the PSF 
variations. It also uses a pixel-based representation of the residuals between the PSF objects and the analytical model, 
which is also allowed to vary across each field. Candidate PSF stars are selected from the collection of bright objects 
in the frame by searching for a tight clump in the distribution of second moments. After rejecting outliers, the PSF 
fit parameters are used to constrain the spatial variations in the PSF model. 

Unlike photo, PSPhot processes each frame separately (without any requirement of continuity of PSF estimation 
across frame boundaries), and each filter separately (without any requirement that the list of objects between the 
separate filters agree). The pipeline outputs positions and PSF magnitudes (and errors) for each detected object; the
results are found in the PsObjAll table in the CAS. The resulting photometry is then matched between filters using 
a 1" matching radius. While the estimated PSF errors output by photo include a term from the uncertainty in the 
PSF fitting, this component is not included in the errors reported by PSPhot. 
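
The cross-filter matching step can be illustrated with a minimal sketch. The function name, the brute-force search, and the flat-sky approximation are assumptions made here for illustration; this is not the PSPhot implementation, and only the 1" radius comes from the text.

```python
import math

def match_detections(cat_a, cat_b, radius_arcsec=1.0):
    """Pair each (ra, dec) in cat_a (degrees) with its nearest neighbor in cat_b
    within radius_arcsec. Returns a list of (index_a, index_b) pairs.
    RA wrap-around at 0/360 deg is ignored in this sketch."""
    pairs = []
    for i, (ra1, dec1) in enumerate(cat_a):
        best_j, best_sep = None, radius_arcsec
        cos_dec = math.cos(math.radians(dec1))
        for j, (ra2, dec2) in enumerate(cat_b):
            dra = (ra2 - ra1) * cos_dec * 3600.0   # arcsec on the sky
            ddec = (dec2 - dec1) * 3600.0
            sep = math.hypot(dra, ddec)
            if sep <= best_sep:
                best_j, best_sep = j, sep
        if best_j is not None:
            pairs.append((i, best_j))
    return pairs

# Hypothetical detections in two filters (degrees).
g_band = [(280.0000, -5.0000), (280.0100, -5.0000)]
r_band = [(280.0001, -5.0001), (280.0200, -5.0000)]
pairs = match_detections(g_band, r_band)   # only the first g-band source has a match
```

A production matcher would use a spatial index rather than the quadratic loop shown here; the loop is kept only for clarity.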

We then run photo, asking it to carry out photometry at the position of each object detected by PSPhot, in addition 
to the positions of objects photo itself detects. This allows photo to do a much better job of distinguishing individual 
objects in crowded regions. In addition, the pipeline is fine-tuned to less aggressively look for overlap between adjacent 
objects, and not to give up as soon as it does at high latitude when faced with deblending large numbers of objects. 
We describe below how the photometry directly out of PSPhot, and that from photo, compare. 

The SDSS PSF photometry had an offset applied to it to make it agree with aperture photometry of bright stars 
within a radius of 7.43"; this large aperture photometry was in fact what was used by ubercalibration to tie all the 
photometry together (Padmanabhan et al. 2008). In crowded regions, finding sufficiently isolated stars to measure 
aperture photometry becomes difficult. PSPhot photometry was forced to agree with these large- aperture magnitudes 
for bright stars; this was done in practice by determining, for each CCD in the imaging camera for each run, the 
average aperture correction needed to put the two on the same system, using stars at Galactic latitude |b| > 15°,
where crowding effects are less severe. 

If any part of a SEGUE imaging run extended to |b| < 25°, the entire run was processed through the photo and
PSPhot code. This sample includes most (but not all) of the SEGUE imaging runs. These PSPhot+photo processed 
runs, designated with rerun=648 in the DR7 CAS and DAS, are declared the Best reduction of these SEGUE runs. 
There is also an inferior Target version of these SEGUE runs which was used to design SEGUE spectroscopic plates; 
it is based on photo alone, as the PSPhot code was unavailable at the time the plates were designed. The Target 
reductions have rerun=40, and are segregated to the SEGUETARGDR7 database. 

This processing revealed a problem with photo. In crowded regions, one cannot find sufficiently isolated stars to 
measure counts through such a large aperture, and in practice, the code corrected PSF magnitudes to an aperture 
photometry radius of 3.00" instead, whenever any part of a given run dipped below |b| = 8°. Thus the aperture
correction was underestimated, typically by 0.03-0.06 mag, depending on the seeing. This was not a problem for 
any of the Legacy imaging scans, but is very much an issue for the SEGUE runs. Fortunately, there is a strong 
correlation, in a given detector, between the aperture correction from a 3.00" aperture to a 7.43" aperture (as measured 
on high-latitude fields), and the seeing. We therefore applied this correction after the fact to the photo PSF, de 
Vaucouleurs, exponential, and model magnitudes for all SEGUE runs affected by this problem. This was carried out 
before ubercalibration, so these runs are photometrically calibrated in a consistent way. 
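
A minimal sketch of such an after-the-fact correction follows. It assumes a linear relation between seeing and the 3.00"-to-7.43" aperture correction for a given detector; the text states only that the correlation is strong, so the linear form, the calibration numbers, and the function names are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical high-latitude calibration data for one detector:
# seeing FWHM (arcsec) and m(3.00") - m(7.43") for bright isolated stars (mag).
seeing = [1.0, 1.2, 1.4, 1.6]
delta_ap = [0.030, 0.040, 0.050, 0.060]
a, b = fit_line(seeing, delta_ap)

def correct_mag(mag, seeing_fwhm):
    """Move a magnitude tied to the 3.00" aperture onto the 7.43" system.
    The 3.00" aperture misses flux, so the corrected magnitude is brighter."""
    return mag - (a + b * seeing_fwhm)

corrected = correct_mag(18.000, 1.3)
```

The sign convention (subtracting the correction) follows from the aperture correction having been underestimated; it is stated here as an assumption of the sketch.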

4.2. Comparison of photo and PSPhot Photometry

The quality of the photometry produced by PSPhot and by photo with the PSPhot-detected objects as input, was 
evaluated by comparing the magnitudes computed by the two methods. Within each field, we calculated the median 
of the difference of PSF magnitudes for stars with 14 < u, g, r, i, z < 20. This median difference had an rms of 0.014
mag. Fields with a difference greater than 0.08 mag are suspect, and further investigation is needed to determine 
which of the two pipelines might be at fault. We followed McGehee et al. (2005) to measure reddening-free colors of 
the same stars that track the stellar locus: 

Qgri = (g - r) - Egri (r - i), (2)
Qriz = (r - i) - Eriz (i - z), (3)

where Egri = 1.582 and Eriz = 0.987. These are normalized to equal zero at high Galactic latitude (note that these
colors do not include the u band).

Fig. 4. The distribution of median Qgri and Qriz parameters measuring the position of the stellar locus within each
field for the photo (left) and PSPhot (right) photometric pipelines; zero values are indicative of uniform photometry.
Within the Galactic plane (lower panels), the PSPhot values are more concentrated, but contain a higher number of
systematic departures from the main locus. The PSPhot code in fact gives a tighter locus at high latitudes as well
(upper panels). Histogram equalization of the gray-scale was used to emphasize low density regions.

Median Qgri and Qriz colors in each field were computed for objects identified as stars and satisfying
magnitude and color cuts as follows: 14 < (u, g, r, i, z) < 20, 0.5 < (u - g) < 1.9, 0.0 < (g - r) < 1.2, -0.2 <
(r - i) < 0.8, and -0.2 < (i - z) < 0.6. The Q-parameters were found to be lower by up to 0.1 mag at low Galactic
latitudes; to remove this effect, we fit a model of a constant plus Lorentzian to the median Q values as a function
of Galactic latitude, and subtracted it. The distributions of the Qgri and Qriz values for both photo and PSPhot
are compared as density plots in Figure 4. From equations 2 and 3, photometric errors in a single filter manifest
themselves differently: δg as a shift in Qgri, δr as a line with slope dQriz/dQgri = -1/(1 + Egri) = -0.35, δi as a
line with slope dQriz/dQgri = -(1 + Eriz)/Egri = -1.07, and δz as a shift in Qriz.

The photo data in a given field was flagged as bad when either |Qgri| or |Qriz| > 0.12 mag (> 5σ) as measured from
photo magnitudes, and similarly for the PSPhot outputs. Of course, a field could be flagged as bad in both sets of
outputs. By this criterion, about 2% of the fields processed with PSPhot were flagged bad based on the photo outputs,
and 3.6% were bad based on PSPhot photometry. The vast majority of the flagged fields are within 15° of the Galactic
plane, and essentially all the fields in which the median difference between photo and PSPhot photometry was greater
than 0.08 mag in a given band were flagged as bad by the Q criteria. This flag and the Qgri and Qriz quantities
themselves can be found in the fieldQA table in the CAS.
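
The per-field flagging can be sketched as follows, assuming the star list has already passed the magnitude and color cuts. The coefficients and the 0.12 mag threshold are taken from the text; the function names and the reference offsets (standing in for the high-latitude normalization and the latitude-dependent model) are hypothetical.

```python
from statistics import median

E_GRI, E_RIZ = 1.582, 0.987   # coefficients quoted in the text
Q_LIMIT = 0.12                # mag; flagging threshold quoted in the text

def q_colors(g, r, i, z):
    """Reddening-free colors of equations 2 and 3."""
    q_gri = (g - r) - E_GRI * (r - i)
    q_riz = (r - i) - E_RIZ * (i - z)
    return q_gri, q_riz

def field_is_bad(stars, q_gri_ref=0.0, q_riz_ref=0.0):
    """stars: list of (g, r, i, z) PSF magnitudes for one field.
    q_*_ref: hypothetical zero points subtracted from the field medians.
    Returns (bad, median_q_gri, median_q_riz)."""
    q = [q_colors(*s) for s in stars]
    med_gri = median(v[0] for v in q) - q_gri_ref
    med_riz = median(v[1] for v in q) - q_riz_ref
    bad = abs(med_gri) > Q_LIMIT or abs(med_riz) > Q_LIMIT
    return bad, med_gri, med_riz

# Hypothetical field: identical stars, then the same field with g too faint by 0.2 mag.
stars = [(18.5, 18.0, 17.8, 17.7)] * 3
ref = q_colors(18.5, 18.0, 17.8, 17.7)
shifted = [(g + 0.2, r, i, z) for g, r, i, z in stars]
good_result = field_is_bad(stars, *ref)
bad_result = field_is_bad(shifted, *ref)
```

In this toy case a 0.2 mag error in g moves the median Qgri by 0.2 mag and trips the flag, while Qriz is unchanged.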

Although more fields are flagged based on the PSPhot outputs, the PSPhot scatter in Figure 4 is tighter at both 
high and low Galactic latitudes than for photo. The PSPhot stellar photometry is therefore preferred for studies of 
the stellar locus (we have not fully assessed its robustness to outliers), but comes with the caveat that fields flagged 
bad should be identified in the fieldQA table and be culled.

An alternative check of SDSS photometry in dense stellar fields was carried out by An et al. (2008), who reduced the 
SDSS imaging data for crowded open and globular cluster fields using the DAOPHOT/ALLFRAME suite of programs 
(Stetson 1987, 1994). At a stellar density of ~ 400 stars deg^-2 with r < 20, they found ~ 2% rms variations in
the difference between photo and DAOPHOT magnitudes in the scanning direction in all five bandpasses (see their
Fig. 3). The systematic structures are likely due to imperfect modeling of the PSFs in photo, given that DAOPHOT
magnitudes exhibit no such large variations with respect to aperture photometry. In other words, the PSF variations
were too rapid for the photo pipeline to follow over a time scale covered approximately by one field (≈ 10' or ≈ 54 sec
in time).

An et al. (2008) further examined the accuracy of photo magnitudes in semi-crowded fields using three open clusters 
in their sample. Stellar densities in these fields were as much as ~ 10 times higher than those in the high Galactic 
latitude fields, but photo recovered ~ 80-90% of stellar objects in the DAOPHOT/ALLFRAME catalog. An et
al. (2008) found that these fields have only marginally stronger spatial variations in photo magnitudes than those at 
lower stellar densities. 



4.3. Further Assessments of Imaging Quality 
Section 4.6 of the EDR paper describes a series of flags available in the database to assess the quality of each field
in the imaging data; this includes information on whether the data in a given field meet survey requirements on seeing
and sky brightness. We have added additional criteria to assess the quality of each field. The CAS table called fieldQA
includes a flag called ProblemChar associated with each field, which is set when:

• The median of the telescope focus over three frames moved more than 60 μm, indicating a problem with the
automated focus of the telescope (Gunn et al. 2006) (ProblemChar='f');

• The rotator angle moved more than 25" between adjacent fields (corresponding to a 0.55" image shift at the
edge of the camera) (ProblemChar='r');

• The astrometric solution shifted by more than 4 pixels (1.6") from a smooth interpolation between adjacent
fields (ProblemChar='a');

• Miscellaneous other problems, including voltage problems in the camera, and lights left on in the telescope; this
was triggered in only two imaging runs (ProblemChar='s').

We flag all fields with these problems in any of the five bandpasses. Because the imaging observations are done 
in driftscan mode (Gunn et al. 1998), different areas of the sky are observed simultaneously in each bandpass and 
referenced to the field number of the r-band observation. Thus in the case of focus problems, we mark the 11 fields 
preceding, and the three fields following the field in question in all camera columns in the run as bad. For the rotator 
and astrometric shift problems, we similarly mark the nine preceding and the one following field as bad. Only 0.3% of 
all fields in the CAS are marked with one of these problems (the majority of which are due to focus problems); these 
flags should be consulted when examining the reliability of the photometry in a given area of sky.
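
The way a single problem spreads to neighboring fields can be sketched as below. The window sizes for focus, rotator, and astrometric problems come from the text; the function name is hypothetical, and the zero-width window for 's' is an assumption, since the text gives no window for miscellaneous problems.

```python
# (fields before, fields after) marked bad for each ProblemChar value.
# The 's' entry is an assumption; the others are quoted in the text.
WINDOW = {'f': (11, 3), 'r': (9, 1), 'a': (9, 1), 's': (0, 0)}

def fields_to_mark(problem_field, problem_char, first_field, last_field):
    """Return the field numbers to mark bad, clipped to the extent of the run."""
    before, after = WINDOW[problem_char]
    lo = max(first_field, problem_field - before)
    hi = min(last_field, problem_field + after)
    return list(range(lo, hi + 1))

# A focus problem at field 100 of a hypothetical run spanning fields 11 to 500.
focus_marked = fields_to_mark(100, 'f', first_field=11, last_field=500)
# A rotator problem near the start of the same run is clipped at field 11.
rotator_marked = fields_to_mark(15, 'r', first_field=11, last_field=500)
```

Per the text, the same field range would be marked in all camera columns of the run.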

4.4. Astrometric Recalibration 

Early SDSS imaging runs were astrometrically calibrated against Tycho-2 (Høg et al. 2000), which yielded statistical
errors per coordinate for bright stars (r < 20) of approximately 75 mas and systematic errors of 20 - 30 mas. Later 
runs were calibrated against preliminary versions of the USNO CCD Astrograph Catalog (UCAC, Zacharias et al. 
2000), which yielded improved statistical errors per coordinate of approximately 45 mas, with systematic errors of 20 
- 30 mas (Pier et al. 2003). Proper motions were not available for the preliminary versions of UCAC. Since the typical 
epoch difference between the SDSS and UCAC observations is a few years and the typical proper motion of UCAC 
stars is a few mas year^-1, this introduces an additional roughly 10 mas of systematic error in the positions due to the
uncorrected proper motions of the calibrating stars. 

All of DR7 has been recalibrated astrometrically against the second data release of UCAC (UCAC2; Zacharias et
al. 2004). While the systematic errors for UCAC2 are not yet well characterized, they are thought to be less than 20
mas (N. Zacharias, private communication). UCAC2 also includes proper motions for stars with δ < +41°. For stars
at higher declination, proper motions from the SDSS+USNO-B catalog (Munn et al. 2004) have been merged with
the UCAC2 positions. With these improvements, all DR7 astrometry has statistical errors per coordinate for bright
stars of approximately 45 mas, with systematic errors of less than 20 mas. The mean differences per run between the
old and new calibrations are a function of position on the sky, with typical absolute mean differences of up to 40 mas.
The rms differences are of order 10 to 40 mas for runs previously reduced against UCAC, and 40 to 80 mas for runs
previously reduced against Tycho-2, consistent with what we would expect given the errors in the reductions.

Note that the formal SDSS names of objects in the CAS are of the form SDSS Jhhmmss.ss±ddmmss.s. Because of
the subtle changes in the astrometry, these names will be slightly different for many objects between DR6 and DR7.
The user should be aware of this in comparing objects between DR6 and DR7.
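
To see why small astrometric shifts change names, a designation of this form can be built from J2000 coordinates as in the sketch below. The function name is hypothetical, and truncating (rather than rounding) the last digit is assumed here, following the usual IAU convention for coordinate-based designations.

```python
import math

def sdss_name(ra_deg, dec_deg):
    """Format J2000 coordinates (degrees) as 'SDSS Jhhmmss.ss+ddmmss.s'.
    The last digits are truncated, not rounded (an assumption of this sketch)."""
    ra_cs = math.floor(ra_deg % 360.0 * 240.0 * 100.0)   # hundredths of a second of time
    hh, rem = divmod(ra_cs, 360000)
    mm, cs = divmod(rem, 6000)
    sign = '-' if dec_deg < 0 else '+'
    dec_ds = math.floor(abs(dec_deg) * 3600.0 * 10.0)    # tenths of an arcsecond
    dd, rem = divmod(dec_ds, 36000)
    dm, ds = divmod(rem, 600)
    return (f"SDSS J{hh:02d}{mm:02d}{cs // 100:02d}.{cs % 100:02d}"
            f"{sign}{dd:02d}{dm:02d}{ds // 10:02d}.{ds % 10:d}")

# Illustrative coordinates only, not catalog values.
name = sdss_name(261.655271, 26.691014)
# A declination shift of 0.06" is enough to change the final digit of the name.
shifted_name = sdss_name(261.655271, 26.691014 + 0.06 / 3600.0)
```

Because the name encodes position to 0.01 s in right ascension and 0.1" in declination, recalibration shifts of tens of mas will change the designation of any object lying close to a digit boundary.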

The CAS includes proper motions for objects derived by combining SDSS astrometry with USNO-B positions, 
recalibrated against SDSS (Munn et al. 2004). These are given in the ProperMotions table in the CAS. An error
was discovered in the proper motion code in Data Releases 3 through 6, which causes smoothly varying systematic
errors, in the proper motion in right ascension only, of typically 1-2 mas year^-1 (see Munn et al. 2007 for a full
description of the problem and its effects). This error has been corrected in DR7, so any analysis of proper motions
should use the DR7 CAS.

4.5. SEGUE Target Selection 

Several of the SEGUE target selection algorithms evolved during the course of SDSS-II. The most significant changes 
occurred to the K-giant algorithm, as it was realized that good color-based luminosity separation could be done only 
for the very reddest (g - r > 1.1) giant candidates by their deviation from the main sequence locus in the ugr color
diagram; this of course requires accurate u band photometry. Early K giant target selection included stars with (g - r)_0
(where the subscript refers to values after correcting for SFD Galactic extinction) as blue as 0.35. The final selection
chose stars with (g - r)_0 between 0.5 and 1.2, and was restricted to g_0 < 18.5; this gives much cleaner samples of K
giants (Yanny et al. 2009). 

In order to allow users to analyze completeness and efficiency of SEGUE stellar target selection samples, the latest
(v4.6) version of the algorithms (Yanny et al. 2009) was applied to all stellar objects in the imaging catalog which
had g < 21 or z < 21, over the entire sky. The appropriate bits were propagated into the SEGUEPrimTarget and
SEGUESecTarget fields of the photoObjAll table of the DR7 CAS. A description of the bits and the target selection
algorithms is available in Yanny et al. (2009).

Footnote: The ProperMotions table was called USNOB in the DR3 and DR4 versions of the CAS.

4.6. Photometric Redshifts 

As described in the DR5 paper, the SDSS makes available the results of two different photometric redshift deter- 
minations for galaxies, one based on neural nets and the other based on a template-fitting approach. With DR7, we 
include improvements to both, as we now describe. 

4.6.1. Photometric Redshifts with Neural Nets

The neural net solutions for photometric redshifts and their errors (listed as Photoz2 in the CAS, and described
in Oyaizu et al. 2008) have not changed since DR6, and do not use the ubercalibrated magnitudes. However, we
now provide a value-added catalog containing the redshift probability distribution for each galaxy, p(z), calculated
using the weights method presented in Cunha et al. (2008). The p(z) for each galaxy in the catalog is the weighted
distribution of the spectroscopic redshifts of the 100 nearest training-set galaxies in the space of dereddened model
colors and r magnitude. For the p(z) calculation we also added the zCOSMOS (Lilly et al. 2007) and DEEP2-EGS
(Davis et al. 2007) galaxies to the spectroscopic training set used for the Photoz2 solution.

Cunha et al. (2008) showed that summing the p(z) for a sample of galaxies yields a better estimation of their true
redshift distribution than that of the individual photometric redshifts. Mandelbaum et al. (2008) found that this gives 
significantly smaller photometric lensing calibration bias than the use of a single photometric redshift estimate for 
each galaxy. 
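
The construction and stacking of p(z) can be sketched as follows. This is an illustration only: the neighbor weights are taken as given (their computation is the subject of Cunha et al. 2008), and the function names, bin edges, and toy inputs are hypothetical.

```python
def pz_from_neighbors(neighbor_z, weights, z_edges):
    """Weighted histogram of neighbor spectroscopic redshifts, normalized to sum to 1."""
    nbins = len(z_edges) - 1
    hist = [0.0] * nbins
    for z, w in zip(neighbor_z, weights):
        for k in range(nbins):
            if z_edges[k] <= z < z_edges[k + 1]:
                hist[k] += w
                break
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def stacked_nz(pz_list):
    """Sum per-galaxy p(z) to estimate the redshift distribution of a sample."""
    return [sum(col) for col in zip(*pz_list)]

# Toy example with a handful of neighbors instead of 100.
z_edges = [0.0, 0.1, 0.2, 0.3]
pz_a = pz_from_neighbors([0.05, 0.15, 0.16, 0.25], [1.0, 1.0, 2.0, 0.0], z_edges)
pz_b = pz_from_neighbors([0.12, 0.22], [1.0, 1.0], z_edges)
nz = stacked_nz([pz_a, pz_b])
```

Stacking keeps the full width of each galaxy's distribution, which is why it can recover the sample redshift distribution better than a histogram of single point estimates.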

4.6.2. Photometric Redshifts: A New Hybrid Technique 

With DR7, we have made substantial improvements in the other photometric redshift code (Photoz), using a hybrid 
method that combines the template fitting approach of Csabai et al. (2003; i.e., the approach used in DR5 and DR6) 
and an empirical calibration using objects with both observed colors and spectroscopic redshifts. We summarize the 
method briefly here, with details to follow in a paper in preparation. 

The spectroscopic sample of SDSS contains over 900,000 spectroscopically confirmed galaxies, and the combination 
of the main sample (Strauss et al. 2001), the LRG sample (Eisenstein et al. 2001) and special plates targeted at fainter 
blue galaxies (DR4 paper) more or less cover the whole color region in which galaxies lie to the depths of SDSS. Thus 
we use the DR7 spectroscopic set as a reference set for redshift estimation without any additional data from synthetic 
spectra. 

The estimation method uses a k-d tree (following Csabai et al. 2007) to search in the ubercalibrated u — g, g — r, 
r — i, i — z color space for the 100 nearest neighbors of every object in the estimation set (i.e. the galaxies for which 
we want to estimate redshift) and then estimates redshift by fitting a local hyperplane to these points, after rejecting
outliers. If an object lies outside the bounding box of the 100 nearest neighbors in color space, the photometric redshift 
is less reliable, and the object is flagged accordingly. 
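
The local fit can be sketched in a few lines. This is a simplified illustration, not the DR7 code: a brute-force neighbor search replaces the k-d tree, outlier rejection is omitted, the hyperplane is solved through the normal equations, and all names and the synthetic reference set are hypothetical.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def local_photoz(colors_ref, z_ref, colors_obj, k=100):
    """Redshift from a hyperplane fit to the k nearest reference galaxies.
    Returns (z_est, outside), where outside flags an object lying outside
    the bounding box of its neighbors in any color."""
    order = sorted(range(len(z_ref)),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(colors_ref[j], colors_obj)))
    nbrs = order[:k]                                     # brute-force neighbor search
    rows = [[1.0] + list(colors_ref[j]) for j in nbrs]   # model: z = c0 + c . colors
    dim = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(dim)] for p in range(dim)]
    atz = [sum(r[p] * z_ref[j] for r, j in zip(rows, nbrs)) for p in range(dim)]
    coef = solve_linear(ata, atz)
    z_est = coef[0] + sum(c * x for c, x in zip(coef[1:], colors_obj))
    outside = any(x < min(colors_ref[j][d] for j in nbrs) or
                  x > max(colors_ref[j][d] for j in nbrs)
                  for d, x in enumerate(colors_obj))
    return z_est, outside

# Synthetic reference set with an exactly linear color-redshift relation.
rng = random.Random(0)
ref_colors = [[rng.random() for _ in range(4)] for _ in range(400)]
ref_z = [0.1 + 0.05 * sum(c) for c in ref_colors]
z_in, flag_in = local_photoz(ref_colors, ref_z, [0.5, 0.5, 0.5, 0.5])
z_out, flag_out = local_photoz(ref_colors, ref_z, [2.0, 2.0, 2.0, 2.0])
```

The second object lies far outside the reference colors, so its estimate is an extrapolation and the bounding-box flag is set, which is the situation the flag in the text is meant to expose.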

We use template fitting to estimate the K-correction, distance modulus, absolute magnitudes, rest frame colors, and
spectral type. We search for the best match of the measured colors and the synthetic colors calculated from repaired
(Budavári et al. 2000) empirical template spectra at the redshift given by the local nearest neighbor fit.

We have found that the mean deviation of the redshifts from the best-fit hyperplane is a good estimate of the
error. That, together with the flag indicating whether an object lies outside the bounding box of its neighbors, and 
the difference between the estimated photometric redshift and the average redshift of its neighbors, can be used to 
select objects with reliable photometric redshift values. 

The rms error of the redshift estimation for the reference set decreases from 0.044 in DR6 to 0.025 in DR7 with this 
improved algorithm (Figure 5). Iteratively removing the outliers beyond 3σ gives rms errors of 0.028 and 0.020 for
the old and new methods, respectively. In addition, the reliability of the quoted errors is much higher. 
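
An iteratively clipped rms of this kind can be computed as in the sketch below. The function names are hypothetical, and measuring the rms about zero (rather than about the mean residual) is an assumption of the sketch.

```python
import math

def rms(values):
    """Root-mean-square about zero."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def clipped_rms(residuals, nsigma=3.0, max_iter=50):
    """rms of the residuals after repeatedly discarding points beyond nsigma * rms.
    Returns (rms, number of points kept)."""
    kept = list(residuals)
    for _ in range(max_iter):
        limit = nsigma * rms(kept)
        survivors = [x for x in kept if abs(x) <= limit]
        if len(survivors) == len(kept):
            break
        kept = survivors
    return rms(kept), len(kept)

# Toy residuals (z_phot - z_spec): a tight core plus one catastrophic outlier.
residuals = [0.02, -0.02] * 10 + [0.5]
raw = rms(residuals)
clipped, n_kept = clipped_rms(residuals)
```

In the toy case a single outlier inflates the raw rms several-fold, while the clipped value reflects the core scatter; this is why both statistics are quoted in the text.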

4.7. SDSS Filter Response Functions 

The response functions of the SDSS imager as a function of wavelength have been monitored throughout the survey. 
The griz responses were stable over time, although very small seasonal (i.e., temperature) variations were observed, 
at a level well below our typical photometric errors. However, we have found a relatively large change in both the 
amplitude and shape of the u-band response, which is likely due to a degradation of the UV enhanced coating of the
u-band CCD. This change in instrumental zero-point is effectively corrected by the photometric calibration for objects 
near the mean color of the standard stars, and, in fact, the repeat photometry of stars in stripe 82 is stable with time 
for stars with -0.5 < g - r < 1.5 (Bramich et al. 2008; Ivezić et al. 2007). However, the observed response changes
involve a roughly 30 Å redward shift in the effective wavelengths of the u filters over the lifetime of the survey, so one
would expect significant changes in the measured colors of objects of extreme color over the period, and this is being 
investigated. Doi et al. (in preparation) will summarize the filter characteristics in full, including column-to-column 
variations within the camera and the changes with time. 
Fig. 5. The template-based estimated redshifts versus the true spectroscopic redshifts for a random sample of 30,000
galaxies with redshifts from SDSS. The estimated values calculated with the old (DR6) method have significantly larger
scatter and more outliers than those from the new hybrid (DR7) technique. Note that the sample is dominated by red
galaxies (whose photometric redshifts are intrinsically easier to measure) at z > 0.2.

As described in the DR6 paper and Mandelbaum et al. (2005), systematic errors in the estimation of the sky near
bright (r < 16) galaxies cause their fluxes and scale sizes to be underestimated and the number of neighboring objects
to be suppressed. Indeed, a number of authors (Lauer et al. 2007, Bernardi et al. 2007, Lisker et al. 2007) have pointed
out systematic errors in SDSS galaxy photometry at the bright end. In the DR6 paper, this effect was quantified
by adding simulated galaxies to the SDSS images using a code described in Masjedi et al. (2006). These simulations
found that the r band brightness of galaxies was underestimated by as much as 0.8 magnitudes for a 12th magnitude
galaxy with Sersic index n = 1 (an "exponential", or disk galaxy). For n = 4 galaxies ("de Vaucouleurs", or elliptical
galaxies), the effect was less pronounced, with a brightness underestimate of less than 0.6 magnitudes.

However, the simulations shown in the DR6 paper used an incorrect relation between galaxy size and magnitude, 
in the sense that they overestimated the extent of the problem for the typical galaxy. Using instead the relationships 
between apparent magnitude and half-light radius measured for SDSS bulge and disk galaxies (Blanton et al. 2003), 
we repeated the exercise: we simulated pure n = 1 and n = 4 galaxies with axis ratios b/a of 0.5 and 1, and added 
them to real r-band SDSS images. We ran the results through photo and compared their measured model magnitudes 
to their true magnitudes; the bias in the measurement is shown as a function of true magnitudes in Figure 6. There 
is appreciable scatter at a given magnitude, due both to the changing background and the different axis ratios. On 
average, however, the flux is underestimated by approximately 0.2 magnitudes at r = 12.5 and < 0.1 magnitudes at 
r = 15 for simulated galaxies with a Sersic index of 1. For a Sersic index of 4, the flux is underestimated by as much 
as 0.3 magnitudes at r = 12.5. The effect is more severe for simulated objects with an axis ratio of 1 than for an axis 
ratio of 0.5 (see Figure 6). The scale sizes of galaxies are similarly underestimated by as much as 20% for simulated 
galaxies with Sersic index of 1, and 30% for an index of 4. Of course, the most massive elliptical or cD galaxies will 
have more extended envelopes, producing a larger effect than we have found here (Lauer et al. 2007). 
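
For readers who think in flux rather than magnitudes, the biases quoted above convert as in this short sketch; the function name is hypothetical and only the standard magnitude-flux relation is used.

```python
def flux_fraction_recovered(mag_bias):
    """Fraction of the true flux recovered when the measured magnitude
    is too faint by mag_bias magnitudes."""
    return 10.0 ** (-0.4 * mag_bias)

frac_disk = flux_fraction_recovered(0.2)        # n = 1 at r = 12.5
frac_elliptical = flux_fraction_recovered(0.3)  # n = 4 at r = 12.5, worst case quoted
```

A 0.2 mag bias thus corresponds to recovering about 83% of the flux, and a 0.3 mag bias to about 76%.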

6. IMPROVEMENTS IN PROCESSING OF SPECTROSCOPIC DATA 

6.1. Correction of Instability in the Spectroscopic Flats 

Spectroscopic flatfields for the blue camera in the first spectrograph contain an interference pattern produced by the
dichroic. The thickness of the dichroic coating is believed to be sensitive to the ambient humidity, and moisture which
enters the system during plate changes affects the instrument response, shifting the interference pattern in wavelength
in unpredictable ways on timescales comparable to the 900 s exposure time. The flats applied in processing were
exposed several minutes prior to, or after, the science frames and therefore were not always representative of the true
instrument response at the time of exposure. The interference pattern is most pronounced in the 3800-4100 Å region
of the spectrum. If it shifts during an exposure, it will not be properly corrected by the flatfield, causing significant
distortion of blue absorption lines in stellar spectra, and systematically affecting estimates of metallicities and surface
temperatures.

Fig. 6. Difference between measured model and true r-band magnitudes of a series of simulated galaxies with Sersic
index of 1 (disk galaxies; upper panel) and 4 (elliptical galaxies; lower panel). These galaxies followed the
magnitude-effective radius relation observed in the SDSS Value-Added Galaxy Catalog (Blanton et al. 2005), and were
either circularly symmetric (circles) or had an axis ratio of 0.5 (diamonds). They were added to random areas of real
high-latitude fields, and run through photo. The simulated elliptical galaxies show a systematic offset even at the faint
end; this is due to the fact that the photo model magnitude code assumes a truncation beyond 7 scale-lengths, while
the "true" magnitude has no such truncation. This is a 0.05 mag effect.

Flats obtained under different conditions were used to identify and model the stable and unstable (shifting) com- 
ponents of the flat, as shown in Figure 7. With this model in hand, we searched for shifts in the interference pattern 
over the typically 45 minute time a given plate was observed by comparing the results of the individual 15-minute 
exposures for each object. Thus we took ratios of the extracted spectra from the separate exposures, and computed 
the median over all objects on a plate, giving results like those on the left-hand side of Figure 8. We fit this ratio to the 
results expected from a shifting interference pattern (essentially a derivative of the shifting component in Figure 7), 
with the only free parameter being the amount of shift, and divided out this remaining component in each spectrum. 
The right-hand panel of Figure 8 shows that this technique removes the majority of the effects of the shifting inter- 
ference. An example is shown in Figure 9, the spectrum of an A star observed on a plate where the interference term 
was particularly bad. The shapes of the absorption lines, especially Hε at 3970 Å, are much more regular in the new
reductions. 
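
The one-parameter fit described above can be sketched as a linear least-squares problem. This assumes, as a first-order approximation made here for illustration, that the ratio of two exposures is 1 + s D(λ), where D is the fractional derivative of the shifting flat component and s is the shift; the function names and the synthetic pattern are hypothetical.

```python
import math

def fit_shift(ratio, pattern_derivative):
    """Least-squares shift s in the model ratio = 1 + s * pattern_derivative."""
    num = sum(d * (r - 1.0) for r, d in zip(ratio, pattern_derivative))
    den = sum(d * d for d in pattern_derivative)
    return num / den

def remove_shift(flux, pattern_derivative, shift):
    """Divide the fitted interference residual out of one spectrum."""
    return [f / (1.0 + shift * d) for f, d in zip(flux, pattern_derivative)]

# Synthetic test: a sinusoidal derivative pattern over 3800-4100 Å and a known shift.
wave = [3800.0 + 5.0 * k for k in range(61)]
deriv = [0.02 * math.cos(2.0 * math.pi * (w - 3800.0) / 60.0) for w in wave]
ratio = [1.0 + 0.5 * d for d in deriv]       # median exposure ratio with shift = 0.5
shift = fit_shift(ratio, deriv)
cleaned = remove_shift(ratio, deriv, shift)  # close to 1.0 everywhere after correction
```

With a single free parameter the fit is well constrained even though the median ratio is noisy, which is the advantage of first modeling the shifting component from dedicated flats.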

6.2. Wavelength Calibration 

The spectroscopic wavelength calibration is done quite accurately in SDSS, with typical errors of 2 km s^-1 or
better. As the DR6 paper describes, however, detailed analyses of stellar spectra revealed occasional errors that were 
substantially larger than this, especially in the blue end of the spectrum. The algorithms for fitting arc and sky lines 
were made more robust for DR6, and this improved the situation considerably. We have implemented two further 
improvements for DR7: 

• Spectroscopy is often done on nights with a moderate amount of moon. The bluest sky line used for wavelength
calibration is a Hg line at 4046 Å, which is very close to a strong Fe I absorption line in the solar spectrum. Thus
when there is substantial moonlight in the sky spectrum, a fit to what is assumed to be an isolated emission
line can be significantly biased, systematically skewing the wavelength solution at the blue end by as much as
20 km s^-1. In DR7, we now fit this line to a linear combination of a Gaussian plus a stellar template including
the absorption line, giving an unbiased estimate of the wavelength of the line. In practice, bright moon affected
10 plates (listed in Yanny et al. 2009) out of a total of 410 SEGUE plates.

• The sky and arc lines for each fiber are fit to a wavelength solution; this is done independently for each fiber. 
This works well for the vast majority of plates. However, for a small fraction of plates, the arcs are weak (perhaps 
because the arc lamps themselves were faulty at that time, or because the petals which reflect the arc lamp light 
were not properly deployed), and the wavelength solution is poorly constrained. We therefore required that 
second- and higher-order terms in the wavelength solution be continuous functions of fiber number, to constrain 
the solution. We found that this produces much more robust wavelength solutions for those plates with weak 
arc observations, and has no substantial effect on the remaining plates. 
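
To put the velocity errors above in wavelength terms, the non-relativistic Doppler relation can be applied as in this short sketch; the function names are hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s

def wavelength_error(delta_v_kms, wavelength):
    """Wavelength offset (same units as wavelength) equivalent to a velocity offset."""
    return wavelength * delta_v_kms / C_KMS

def velocity_error_kms(delta_lambda, wavelength):
    """Velocity offset in km/s equivalent to a wavelength offset."""
    return C_KMS * delta_lambda / wavelength

bias_angstrom = wavelength_error(20.0, 4046.0)    # bias from the blended Hg line
typical_angstrom = wavelength_error(2.0, 4046.0)  # typical calibration error
```

A 20 km s^-1 bias at 4046 Å corresponds to roughly 0.27 Å, ten times the shift implied by the typical 2 km s^-1 calibration error.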

The stellar spectral template library which gives the best radial velocity estimates is based on the ELODIE library
(Prugniel & Soubiran 2001). We have removed one ELODIE template that gave velocities with a consistent offset from
the rest of the library, as measured using the sample of ~ 5000 stars with duplicate observations on each SEGUE plate
pair. In order to provide more complete coverage in effective temperature, surface gravity and metallicity for hot stars,
we generated a grid of synthetic spectra using the models from Castelli & Kurucz (2003) over the same wavelength
range and at the same resolving power as the spectra in the ELODIE library. This blue grid spans 6000-9500 K in 500 K
increments, -0.5 > [Fe/H] > -2.5 in increments of 0.5 dex, and log g of 2 and 4. We also added a grid of synthetic
carbon enhanced spectra (Plez, private communication, using the stellar atmospheric code described by Gustafsson et
al. 2008) at values of [Fe/H] between -1 and -4, [C/Fe] between 1 and 4, log g values between 2 and 4, and Teff in
the range 4000-6000 K. With these improvements, the radial velocity scatter in repeat observations for objects that
match the carbon star templates is now the same as for the full sample.

Fig. 9. The spectrum of SDSS J172637.26+264127.6, an A0 star observed as part of SEGUE. The strong broad lines are
due to Balmer absorption. The red dashed spectrum is that available in DR6, while the black solid spectrum is from
DR7, with its improved flat-field.

The DR6 paper describes a 7 km s^-1 systematic error in the radial velocities of stars (in the sense that the pipeline-
reported velocities are too small). This is still with us in DR7; a correction is put into the outputs of the SEGUE Stellar
Parameter Pipeline (Lee et al. 2008a) but not elsewhere in the CAS or DAS. Beyond this problem, the plate-to-plate
velocities of SEGUE stars have systematic errors of about 2 km s^-1 in the mean. The rms velocity error of any given
SEGUE star observation is about 5.5 km s^-1 at g = 18.5, degrading to 12 km s^-1 at g = 19.5.



6.3. Strong Unresolved Emission Lines 

The spectroscopic pipeline combines observations of a given object on the red and blue spectrographs, and between 
the separate 15-minute exposures on the sky, by fitting a tightly-constrained spline to the data, allowing discrepant 
points such as cosmic rays to be rejected. This spline requires as input the effective resolution of the spectra. As 
described in the DR6 paper, it did not do a perfect job; occasionally, very strong and sharp emission lines were 
erroneously rejected by this algorithm. This turned out to be due to the fact that the spline code did not adequately 
track the changing resolution of the spectra as a function of wavelength and fiber number. Including this effect 
significantly improved the behavior of this algorithm. Figure 10 shows an example spectrum of an object affected by 
this problem in DR6, and its improved counterpart in DR7, as is apparent from the correct 3:1 ratio of the 5007 Å and
4959 Å lines of [OIII].

Fig. 10.— Spectra of the object SDSS J153704.18+551550.6=Mrk487, in DR6 and DR7. The stronger [OIII] emission line at 5020Å was 
mistaken for a cosmic ray and clipped away completely in DR6, while the weaker line at 4970Å was slightly affected. With the improved 
algorithm in DR7, the lines are not clipped. 

There is another problem, unfortunately not fixed in DR7, which has a similar effect. If the line is so bright that 
it is saturated in the individual 15-minute exposures of the spectrograph, it will also appear clipped. The flux value 
corresponding to saturation is a function of wavelength, but ranges from 2000 to 10,000 times 10^-17 erg s^-1 cm^-2 Å^-1 
(the units in which spectral flux density is reported in the SDSS outputs). Unfortunately, such saturated pixels are 
not flagged as such, although usually they are recognizable as having an inverse variance equal to zero. Luckily, objects 
with such strong emission lines are very rare, but the user should be aware of the possibility of objects with extremely 
strong emission lines and unphysical or unusual line ratios. 
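
Because saturated pixels carry no dedicated flag, a user may want to look for zero inverse variance in the immediate neighborhood of very bright pixels. The sketch below illustrates one way to do this; the function name, the default threshold (taken as the lower end of the saturation range quoted above), and the `reach` parameter are assumptions for illustration, not part of the SDSS software.

```python
import numpy as np


def suspect_saturated(flux, ivar, flux_threshold=2000.0, reach=3):
    """Flag pixels with zero inverse variance that lie near very bright pixels.

    flux : spectral flux density in units of 10^-17 erg s^-1 cm^-2 A^-1
    ivar : inverse variance of `flux`
    flux_threshold : flux above which a pixel counts as bright (assumed default)
    reach : number of pixels on each side of a bright pixel to inspect
    """
    flux = np.asarray(flux, dtype=float)
    ivar = np.asarray(ivar, dtype=float)
    bright = flux >= flux_threshold
    # Dilate the bright mask by `reach` pixels on each side.
    kernel = np.ones(2 * reach + 1)
    near_bright = np.convolve(bright.astype(float), kernel, mode="same") > 0
    return (ivar == 0) & near_bright
```

Pixels with zero inverse variance far from any bright feature are left unflagged, since they are usually masked for unrelated reasons. The true saturation level depends on wavelength, so a single threshold is only a rough screen.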



6.4. Improvements in the SEGUE Stellar Parameter Pipeline 

There have been several improvements made in the SEGUE Stellar Parameter Pipeline (SSPP; Lee et al. 2008a; 
2008b; Allende Prieto et al. 2008a) since the release of DR6. In particular, in DR6, the SSPP under-estimated 
metallicities (by about 0.3 dex) for stars approaching solar metallicity. This was fixed in DR7 by adding synthetic 
spectra with super-solar metallicities to two of the synthetic grid matching techniques (NGS1 and NGS2), and by 
recalibrating the CaIIK2, ACF, CaIIT, and ANNRR methods. See Table 5 in Lee et al. (2008a) for the naming convention 
for each technique. Two new techniques (ANNRR and CaIIK3) were also added to the SSPP metallicity estimation 
schemes, and contributed to the high-metallicity performance improvement. 

Two methods, ACF and CaIIT, have been recalibrated to the "native" g − r system, instead of making use of calibration 
on B − V, which required application of an uncertain transformation in color space. The ANNRR approach, which also 
tended to under-estimate metallicity for near-solar metallicity stars, has been re-trained on the SDSS/SEGUE spectra 
with improved stellar parameters, resulting in a better determination of the metallicity for metal-rich stars. Moreover, 
a neural network approach, based solely on noise-added synthetic spectra, has also been introduced. There remains a 
tendency for the SSPP to assign slightly higher metallicities for stars with [Fe/H] < −2.7. This offset is presently being 
calibrated out, and will be corrected in SEGUE-2; see below. For more detailed descriptions of individual methods of 
the SSPP, we refer the interested reader to Lee et al. (2008a). 

Additionally, the pipeline now identifies cool main sequence stars of low metallicity (late-K and M subdwarfs). 
The stars are assigned metallicity classes and spectral subtypes following the classification system of Lépine et al. 
(2007). Cool and ultra-cool subdwarfs are classified as subdwarfs (sdK, sdM), extreme subdwarfs (esdK, esdM), and 
ultrasubdwarfs (usdK, usdM) in order of decreasing metal content. The classification is based on the absolute and 
relative values of the TiO and CaH molecular bandstrengths, and is derived from fits to K-M dwarf and K-M subdwarf 
spectral templates. 
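
To illustrate how relative TiO and CaH bandstrengths can separate these classes, the sketch below computes a metallicity index of the kind defined by Lépine et al. (2007) from the TiO5, CaH2, and CaH3 band indices. It is not the SSPP implementation, which works from template fits. The polynomial coefficients and class boundaries are quoted from memory of that paper and should be treated as assumptions to be verified against it; the function names are hypothetical.

```python
def tio5_solar(cah):
    """Expected TiO5 index at solar metallicity for cah = CaH2 + CaH3.

    Coefficients are assumed (recalled from Lepine et al. 2007); verify before use.
    """
    return -0.164 * cah**3 + 0.670 * cah**2 - 0.118 * cah - 0.050


def zeta(tio5, cah2, cah3):
    """Metallicity index: ratio of observed to solar-metallicity TiO depth."""
    return (1.0 - tio5) / (1.0 - tio5_solar(cah2 + cah3))


def metallicity_class(z):
    """Map the index to a class prefix, in order of decreasing metal content.

    Boundaries (0.825, 0.5, 0.2) are assumed values, not taken from this paper.
    """
    if z > 0.825:
        return "d"    # ordinary dwarf
    if z > 0.5:
        return "sd"   # subdwarf
    if z > 0.2:
        return "esd"  # extreme subdwarf
    return "usd"      # ultrasubdwarf
```

For example, `metallicity_class(zeta(0.6, 0.5, 0.5))` evaluates the index for a star with CaH2 + CaH3 = 1.0 and a TiO band weaker than expected at solar metallicity.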

A number of open and globular clusters have been observed photometrically and spectroscopically with the SDSS 
instruments to evaluate the performance of the SSPP (Lee et al. 2008b). In addition, high-resolution spectra have been 
obtained for about 100 field stars included in the SDSS, and used to expand the SSPP checks over a wider parameter 
space (Allende Prieto et al. 2008a). 

7. LOOKING AHEAD TO SDSS-III 

This paper marks the release of the final data of SDSS-II. The original SDSS science goals (York et al. 2000) 
included five-band imaging over 10^4 deg^2 with 2% rms errors or better in photometric calibration, and spectroscopy of 
10^6 galaxies and 10^5 quasars. We have met these goals, and have in addition carried out extensive stellar spectroscopy 
of close to half a million stars, and repeat imaging over 250 deg^2 to search for supernovae. Over 2200 refereed papers 
have been published to date using SDSS data or results, on subjects ranging from the large-scale distribution of galaxies 
to distant quasars to substructure in the Galactic halo to surveys of white dwarfs to the color distribution of main 
belt asteroids. 

The SDSS telescope has started a new operational phase, called SDSS-III, which will include four surveys with the 
2.5m telescope through 2014: 

• SEGUE-2 extends the science goals of SEGUE with the same instrumentation and data processing pipelines, but 
targets fainter stars to study the distant halo. It will increase the number of distant halo stars by a factor of 2.5 
with respect to the results of SDSS and SDSS-II. 

• The Baryon Oscillation Spectroscopic Survey (BOSS) will perform spectroscopy of 1.5 million luminous red 
galaxies to z ≈ 0.7 and 160,000 quasars with 2.3 < z < 3 to measure the scale of the baryon oscillation signal in 
the correlation function as a function of redshift (Schlegel et al. 2007). 

• The Multi-object APO Radial Velocity Exoplanet Large-area Survey (MARVELS) will monitor the radial velocities 
of 11,000 bright stars to search for the signature of planets with periods ranging from several hours to two 
years (Ge et al. 2008). 

• The APO Galactic Evolution Experiment (APOGEE) will perform R ≈ 20,000 H-band spectroscopy of 10^5 
giant stars to H = 13.5 for detailed radial velocity and chemical studies of the Milky Way (Majewski et al. 2008; 
Allende Prieto et al. 2008b). 

These data will be made public in a series of data releases, following the pattern established by SDSS and SDSS-II. 

This paper represents the end of SDSS-II, the culmination of a project taking two decades and involving an enormous 
number of scientists from all over the world. We would like to dedicate this paper to colleagues who made essential 
contributions to the SDSS but are no longer with us: John N. Bahcall, Don Baldwin, Norm Cole, Arthur Davidsen, 
Jim Gray, Bohdan Paczyński, and David N. Schramm. The successful completion of this project is in large part a 
reflection of the hard work and intellectual capital they put into it. 

Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating 
Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space 
Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council 
for England. The SDSS Web Site is http://www.sdss.org/. 
Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese 
Academy of Sciences (LAMOST), Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), 
the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State University, University of 
Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University 